An Approach towards Optimizing Random Forest using Dynamic Programming Algorithm

Authors

  • Vrushali Y Kulkarni
  • Aashu Singh
  • Pradeep K Sinha
Abstract

Random Forest (RF) is an ensemble supervised machine learning technique. Based on bagging and random feature selection, a number of decision trees (base classifiers) are generated and a majority vote is taken among them. The size of an RF is subjective and varies from one dataset to another. Furthermore, due to the randomization induced during its creation and its huge size, RF has at best been described as a black box. Various changes to the algorithm, based on experimental results, have since been proposed to optimize the performance of RF. To this end, we aim to find a subset of the forest whose accuracy is comparable to that of the original RF but whose size is much smaller. In this paper, we show that the problem of selecting an optimal subset of a random forest follows the dynamic programming paradigm. Applying this approach to various UCI datasets, we obtain and study the corresponding subsets. We conclude that such subsets do exist and that they are not unique. Moreover, the size of these subsets is a small fraction of the original RF (in the range of tens), and the accuracy of these subsets is a discrete-valued function over its range.
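The abstract does not spell out the dynamic programming formulation, so the following is only a minimal sketch of the underlying problem, not the authors' algorithm. It uses plain exhaustive search over odd-sized subsets (odd sizes avoid ties in binary majority votes). The toy predictions and the names `majority_vote`, `subset_accuracy`, and `smallest_comparable_subsets` are hypothetical and introduced purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: 5 trees, 6 samples, binary labels (not from the paper).
LABELS = [0, 1, 1, 0, 1, 0]
TREE_PREDS = [
    [0, 1, 1, 0, 0, 0],  # tree 0: wrong on sample 4
    [0, 1, 0, 0, 1, 0],  # tree 1: wrong on sample 2
    [1, 1, 1, 0, 1, 0],  # tree 2: wrong on sample 0
    [0, 0, 1, 1, 1, 0],  # tree 3: wrong on samples 1 and 3
    [0, 1, 1, 0, 1, 1],  # tree 4: wrong on sample 5
]


def majority_vote(votes):
    """Return the most common label; ties go to the smallest label."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


def subset_accuracy(tree_preds, labels, subset):
    """Accuracy of majority voting restricted to the trees indexed by `subset`."""
    correct = 0
    for i, y in enumerate(labels):
        if majority_vote([tree_preds[t][i] for t in subset]) == y:
            correct += 1
    return correct / len(labels)


def smallest_comparable_subsets(tree_preds, labels, tolerance=0.0):
    """Exhaustively search odd subset sizes, smallest first, and return
    (full-forest accuracy, all subsets of the first size that come within
    `tolerance` of it). Exponential cost: for illustration only."""
    n_trees = len(tree_preds)
    full = subset_accuracy(tree_preds, labels, range(n_trees))
    for size in range(1, n_trees + 1, 2):
        hits = [
            s for s in combinations(range(n_trees), size)
            if subset_accuracy(tree_preds, labels, s) >= full - tolerance
        ]
        if hits:
            return full, hits
    return full, []


full, hits = smallest_comparable_subsets(TREE_PREDS, LABELS)
print(full, len(hits[0]), len(hits))
```

On this toy data no single tree matches the full forest, but every three-tree subset does, because no two trees err on the same sample. That mirrors the claim that comparable subsets exist and are not unique. Accuracy here is always a multiple of 1/6 (one over the number of samples), which is one way to read the "discrete-valued" remark.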

Similar resources

Stochastic Dynamic Programming with Markov Chains for Optimal Sustainable Control of the Forest Sector with Continuous Cover Forestry

We present a stochastic dynamic programming approach with Markov chains for optimal control of the forest sector. The forest is managed via continuous cover forestry and the complete system is sustainable. Forest industry production, logistic solutions and harvest levels are optimized based on the sequentially revealed states of the markets. Adaptive full system optimization is necessary for co...

Designing a new multi-objective fuzzy stochastic DEA model in a dynamic environment to estimate efficiency of decision making units (Case Study: An Iranian Petroleum Company)

This paper presents a new multi-objective fuzzy stochastic data envelopment analysis model (MOFS-DEA) under mean chance constraints and common weights to estimate the efficiency of decision making units for their future financial periods. In the initial MOFS-DEA model, the outputs and inputs are characterized by random triangular fuzzy variables with normal distribution, in which ...

A New Dynamic Random Fuzzy DEA Model to Predict Performance of Decision Making Units

Data envelopment analysis (DEA) is a methodology for measuring the relative efficiency of decision making units (DMUs) which consume the same types of inputs and produce the same types of outputs. Believing that future planning and predicting efficiency are very important for DMUs, this paper first presents a new dynamic random fuzzy DEA model (DRF-DEA) with common weights (using...

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with few relations, but as the number of relations in a query grows, they need so much memory and processing that they become impractical, so random and evolutionary methods must be used instead. The use of evolutionary methods, beca...

Optimizing image steganography by combining the GA and ICA

In this study, a novel approach is proposed that combines steganography and cryptography to hide information in digital images used as host media. In the process, secret data is first encrypted using the mono-alphabetic substitution cipher method and then the encrypted secret data is embedded inside an image using an algorithm which combines the random patterns based on Space Fillin...

Publication date: 2013